Dynamic Data Mining: Methodology and Algorithms
Supervised data stream mining has become an important and challenging data mining task in modern
organizations. The key challenges are threefold: (1) a possibly infinite number of streaming examples
and time-critical analysis constraints; (2) concept drift; and (3) skewed data distributions.
To address these three challenges, this thesis proposes a novel dynamic data mining (DDM)
methodology that applies supervised ensemble models to data stream mining. DDM can be
loosely defined as categorization-organization-selection of supervised ensemble models. It is inspired
by the idea that although the underlying concepts in a data stream are time-varying, their distinctions
can be identified. Therefore, the models trained on the distinct concepts can be dynamically selected in
order to classify incoming examples of similar concepts.
First, following the general paradigm of DDM, we examine the different concept-drifting stream
mining scenarios and propose corresponding effective and efficient data mining algorithms.
• To address concept drift caused merely by changes of variable distributions, which we term
pseudo concept drift, base models built on categorized streaming data are organized and
selected in line with their corresponding variable distribution characteristics.
• To address concept drift caused by changes of variable and class joint distributions, which we
term true concept drift, an effective data categorization scheme is introduced. A group of
working models is dynamically organized and selected for reacting to the drifting concept.
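The selection idea described above for pseudo concept drift can be illustrated with a minimal sketch. It is not the thesis's algorithm: the nearest-centroid base learner, the mean-vector chunk profile, the Euclidean distance, and the names `CentroidModel` and `DynamicSelector` are all assumptions chosen for brevity. Each labeled chunk yields one base model plus a summary of its variable distribution, and an incoming example is classified by the model whose summary is closest to it.

```python
import math


def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


class CentroidModel:
    """Hypothetical stand-in base learner: nearest-centroid classifier for one chunk."""

    def __init__(self, X, y):
        # Summary of the chunk's variable distribution, used later for selection.
        self.profile = mean_vector(X)
        # One centroid per class label seen in this chunk.
        self.centroids = {
            label: mean_vector([x for x, lab in zip(X, y) if lab == label])
            for label in set(y)
        }

    def predict(self, x):
        # Return the label whose centroid is closest to x.
        return min(self.centroids, key=lambda lab: math.dist(x, self.centroids[lab]))


class DynamicSelector:
    """Keeps one model per chunk and selects the one whose profile is closest to x."""

    def __init__(self):
        self.models = []

    def add_chunk(self, X, y):
        self.models.append(CentroidModel(X, y))

    def predict(self, x):
        best = min(self.models, key=lambda m: math.dist(x, m.profile))
        return best.predict(x)


selector = DynamicSelector()
# Chunk A: features near the origin.
selector.add_chunk([[0.0, 0.0], [1.0, 0.0]], ["neg", "pos"])
# Chunk B: the variable distribution has shifted to around 10 on the first feature.
selector.add_chunk([[10.0, 0.0], [11.0, 0.0]], ["neg", "pos"])
print(selector.predict([0.9, 0.0]))   # pos (model from chunk A is selected)
print(selector.predict([10.1, 0.0]))  # neg (model from chunk B is selected)
```

A model trained on chunk A alone would label the second example "pos", since it lies nearer to A's "pos" centroid; selecting the model by distribution similarity is what changes the answer in this toy setting.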
Second, we introduce an integration stream mining framework that makes the paradigm advocated by
DDM widely applicable to other stream mining problems. With this framework, we readily introduce
six effective algorithms for mining data streams with skewed class distributions.
In addition, we also introduce a new ensemble model approach for batch learning, following the same
methodology. Both theoretical and empirical studies demonstrate its effectiveness.
Future work will target improving the effectiveness and efficiency of the proposed
algorithms. Meanwhile, we will explore the possibilities of using the integration framework to solve
other open stream mining research problems.
Efficient Turbulent Compressible Convection in the Deep Stellar Atmosphere
This paper reports an application of the gas-kinetic BGK scheme to the
computation of turbulent compressible convection in the stellar interior. After
incorporating the Sub-grid Scale (SGS) turbulence model into the BGK scheme, we
tested the effects of numerical parameters on the quantitative relationships
among the thermodynamic variables, their fluctuations and correlations in a
very deep, initially gravity-stratified stellar atmosphere. Comparison
indicates that the thermal properties and dynamic properties are dominated by
different aspects of numerical models separately. An adjustable Deardorff
constant in the SGS model and an amplitude of artificial viscosity
in the gas-kinetic BGK scheme are appropriate for the current study. We
also calculated the density-weighted auto- and cross-correlation functions in
Xiong's (\cite{xiong77}) turbulent stellar convection theories based on which
the gradient type of models of the non-local transport and the anisotropy of
the turbulence are preliminarily studied. No universal relations or constant
parameters were found for these models.
Comment: 13 pages, 8 figures, accepted by ChJA
Turbulent convection model in the overshooting region: II. Theoretical analysis
Turbulent convection models are thought to be good tools to deal with the
convective overshooting in the stellar interior. However, they are too complex
to be applied in calculations of stellar structure and evolution. In order to
understand the physical processes of the convective overshooting and to
simplify the application of turbulent convection models, a semi-analytic
solution is necessary.
We obtain the approximate solution and asymptotic solution of the turbulent
convection model in the overshooting region, and find some important properties
of the convective overshooting:
I. The overshooting region can be partitioned into three parts: a thin region
just outside the convective boundary with high efficiency of turbulent heat
transfer, a power law dissipation region of turbulent kinetic energy in the
middle, and a thermal dissipation area with rapidly decreasing turbulent
kinetic energy. The decaying indices of the turbulent correlations are
determined only by the parameters of the
TCM, and there is an equilibrium value of the anisotropic degree.
II. The overshooting length of the turbulent heat flux is
about ().
III. The value of the turbulent kinetic energy at the convective boundary
can be estimated by a method called "the maximum of diffusion".
Turbulent correlations in the overshooting region can be estimated by using
and exponentially decreasing functions with the decaying indices.
Comment: 32 pages, 9 figures, accepted by The Astrophysical Journal
Critical Success Factors for Effective Knowledge Sharing in Chinese Joint Ventures
Effective knowledge sharing is vital to the success of international joint ventures. To ensure that organizational knowledge in a joint venture can be smoothly communicated and exchanged among its employees in a multicultural environment, the impact of culture on knowledge sharing needs to be well understood. This paper investigates the impact of culture on knowledge sharing in Chinese joint ventures. Using a multi-case study approach, it shows that effective communication, shared mindsets, training, and leadership are the critical success factors for effective knowledge sharing in Chinese joint ventures. These findings help in developing a specific organizational culture that supports knowledge sharing and can lead to better organizational performance in the increasingly globalized economy.